AITopics | functional derivative

Collaborating Authors

functional derivative

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Supplementary Material: Repulsive Deep Ensembles are Bayesian ANon-identifiable neural networks

Neural Information Processing SystemsApr-24-2026, 23:34:28 GMT

Deep neural networks are parametric models able to learn complex non-linear functions from few training instances and thus can be deployed to solve many tasks. Their overparameterized architecture, characterized by a number of parameters far larger than that of training data points, enables them to retain entire datasets even with random labels [84]. Even more, this overparameterized regime makes neural network approximations of a given function not unique in the sense that multiple configurations of weights might lead to the same function. Indeed, the output of a feed forward neural network given some fixed input remains unchanged under a set of transformations. For instance, certain weight permutations and sign flips in MLPs leave the output unchanged [9].

artificial intelligence, machine learning, neural network, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

1c63926ebcabda26b5cdb31b5cc91efb-Supplemental.pdf

Neural Information Processing SystemsFeb-7-2026, 17:36:12 GMT

dx 0, neural network, random seed, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Inference-Aware Meta-Alignment of LLMs via Non-Linear GRPO

Takakura, Shokichi, Wachi, Akifumi, Higuchi, Rei, Miyaguchi, Kohei, Suzuki, Taiji

arXiv.org Machine LearningFeb-3-2026

Aligning large language models (LLMs) to diverse human preferences is fundamentally challenging since criteria can often conflict with each other. Inference-time alignment methods have recently gained popularity as they allow LLMs to be aligned to multiple criteria via different alignment algorithms at inference time. However, inference-time alignment is computationally expensive since it often requires multiple forward passes of the base model. In this work, we propose inference-aware meta-alignment (IAMA), a novel approach that enables LLMs to be aligned to multiple criteria with limited computational budget at inference time. IAMA trains a base model such that it can be effectively aligned to multiple tasks via different inference-time alignment algorithms. To solve the non-linear optimization problems involved in IAMA, we propose non-linear GRPO, which provably converges to the optimal solution in the space of probability measures.

artificial intelligence, large language model, natural language, (12 more...)

arXiv.org Machine Learning

2602.01603

Country:

Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Hamiltonian Neural PDE Solvers through Functional Approximation

Zhou, Anthony, Farimani, Amir Barati

arXiv.org Artificial IntelligenceSep-30-2025

Designing neural networks within a Hamiltonian framework offers a principled way to ensure that conservation laws are respected in physical systems. While promising, these capabilities have been largely limited to discrete, analytically solvable systems. In contrast, many physical phenomena are governed by PDEs, which govern infinite-dimensional fields through Hamiltonian functionals and their functional derivatives. Building on prior work, we represent the Hamiltonian functional as a kernel integral parameterized by a neural field, enabling learnable function-to-scalar mappings and the use of automatic differentiation to calculate functional derivatives. This allows for an extension of Hamiltonian mechanics to neural PDE solvers by predicting a functional and learning in the gradient domain. We show that the resulting Hamiltonian Neural Solver (HNS) can be an effective surrogate model through improved stability and conserving energy-like quantities across 1D and 2D PDEs. This ability to respect conservation laws also allows HNS models to better generalize to longer time horizons or unseen initial conditions.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.13275

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Physics-informed Neural Networks for Functional Differential Equations: Cylindrical Approximation and Its Convergence Guarantees

Neural Information Processing SystemsAug-20-2025, 21:56:58 GMT

We propose the first learning scheme for functional differential equations (FDEs). FDEs play a fundamental role in physics, mathematics, and optimal control.

artificial intelligence, deep learning, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Energy > Oil & Gas > Upstream (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Direct Distributional Optimization for Provable Alignment of Diffusion Models

Kawata, Ryotaro, Oko, Kazusato, Nitanda, Atsushi, Suzuki, Taiji

arXiv.org Artificial IntelligenceFeb-5-2025

We introduce a novel alignment method for diffusion models from distribution optimization perspectives while providing rigorous convergence guarantees. We first formulate the problem as a generic regularized loss minimization over probability distributions and directly optimize the distribution using the Dual Averaging method. Next, we enable sampling from the learned distribution by approximating its score function via Doob's $h$-transform technique. The proposed framework is supported by rigorous convergence guarantees and an end-to-end bound on the sampling error, which imply that when the original distribution's score is known accurately, the complexity of sampling from shifted distributions is independent of isoperimetric conditions. This framework is broadly applicable to general distribution optimization problems, including alignment tasks in Reinforcement Learning with Human Feedback (RLHF), Direct Preference Optimization (DPO), and Kahneman-Tversky Optimization (KTO). We empirically validate its performance on synthetic and image datasets using the DPO objective.

artificial intelligence, diffusion model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2502.02954

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Singapore (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Physics-informed Neural Networks for Functional Differential Equations: Cylindrical Approximation and Its Convergence Guarantees

Miyagawa, Taiki, Yokota, Takeru

arXiv.org Machine LearningOct-23-2024

We propose the first learning scheme for functional differential equations (FDEs). FDEs play a fundamental role in physics, mathematics, and optimal control. However, the numerical analysis of FDEs has faced challenges due to its unrealistic computational costs and has been a long standing problem over decades. Thus, numerical approximations of FDEs have been developed, but they often oversimplify the solutions. To tackle these two issues, we propose a hybrid approach combining physics-informed neural networks (PINNs) with the \textit{cylindrical approximation}. The cylindrical approximation expands functions and functional derivatives with an orthonormal basis and transforms FDEs into high-dimensional PDEs. To validate the reliability of the cylindrical approximation for FDE applications, we prove the convergence theorems of approximated functional derivatives and solutions. Then, the derived high-dimensional PDEs are numerically solved with PINNs. Through the capabilities of PINNs, our approach can handle a broader class of functional derivatives more efficiently than conventional discretization-based methods, improving the scalability of the cylindrical approximation. As a proof of concept, we conduct experiments on two FDEs and demonstrate that our model can successfully achieve typical $L^1$ relative error orders of PINNs $\sim 10^{-3}$. Overall, our work provides a strong backbone for physicists, mathematicians, and machine learning experts to analyze previously challenging FDEs, thereby democratizing their numerical analysis, which has received limited attention. Code is available at \url{https://github.com/TaikiMiyagawa/FunctionalPINN}.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Machine Learning

2410.18153

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Energy > Oil & Gas > Upstream (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Understanding Transfer Learning via Mean-field Analysis

Aminian, Gholamali, Szpruch, Łukasz, Cohen, Samuel N.

arXiv.org Machine LearningOct-23-2024

We propose a novel framework for exploring generalization errors of transfer learning through the lens of differential calculus on the space of probability measures. In particular, we consider two main transfer learning scenarios, $\alpha$-ERM and fine-tuning with the KL-regularized empirical risk minimization and establish generic conditions under which the generalization error and the population risk convergence rates for these scenarios are studied. Based on our theoretical results, we show the benefits of transfer learning with a one-hidden-layer neural network in the mean-field regime under some suitable integrability and regularity assumptions on the loss and activation functions.

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Machine Learning

2410.17128

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Mean-field Analysis of Generalization Errors

Aminian, Gholamali, Cohen, Samuel N., Szpruch, Łukasz

arXiv.org Artificial IntelligenceJun-20-2023

We propose a novel framework for exploring weak and $L_2$ generalization errors of algorithms through the lens of differential calculus on the space of probability measures. Specifically, we consider the KL-regularized empirical risk minimization problem and establish generic conditions under which the generalization error convergence rate, when training on a sample of size $n$, is $\mathcal{O}(1/n)$. In the context of supervised learning with a one-hidden layer neural network in the mean-field regime, these conditions are reflected in suitable integrability and regularity assumptions on the loss and activation functions.

artificial intelligence, generalization error, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2306.11623

Country: